Abstract - Dismounted targets can be tracked in urban environments with video
sensors. Real-time systems are unable to process all of the imagery, which
demands some method for prioritizing the processing resources. Furthermore,
various segmentation algorithms exist within image processing; each algorithm
possesses unique capabilities and has an associated computational cost.
Additional complexity arises in the prioritization problem when targets become
occluded (e.g., by a building) and when the targets are intermixed with other
dismounted entities. This added complexity leads to the question "which
portions of the scene warrant both low cost and high cost processing?" The
approach presented in this paper is to apply multi-target tracking techniques
in conjunction with an integer programming optimization routine to determine
the optimal allocation of the video processing resources. This architecture
results in feedback from the tracking routine to the image processing function,
which in turn enhances the ability of the tracker.
Keywords: Tracking, resource allocation, image processing, urban dismount,
integer programming.

1  Introduction 

The operational impetus for the research presented here is the need to track
dismounted targets in urban environments. Recent efforts by the Automatic
Target Recognition (ATR) Division of the Air Force Research Laboratory Sensors
Directorate have led to a focus on video-based sensing to accomplish urban
dismount tracking. Two issues that naturally arise in these types of scenarios
are multiple targets and regions of measurement occlusion (e.g., targets behind
buildings). This paper addresses these specific issues by combining
multi-target tracking with an image processing resource allocation algorithm.
Details of the methods used are presented along with preliminary results
demonstrated via computer simulation. In multiple-target tracking, the
computational load depends on the complexity of the tracker. More complex
systems may be more robust, but they demand more resources from the computer.
When using video-based tracking, additional computational load arises from the
measurement generation process, commonly known as image processing. Real-time
systems may be unable to process all of the imagery provided by the sensor.
Furthermore, one has to decide which segmentation method to use on the image,
as each method possesses unique capabilities and has an associated
computational cost. The focus of this research is the development of an
optimization methodology to determine the allocation of the video processing
resources. A key aspect of the architecture presented is the information
exchange between the tracking function and the resource allocation function.
Each function enhances the capability of the other, and a feedback loop can be
formed for this application.

It  is  important  to  note  that  the  tracking  func¬ 
tion  consists  of  both  kinematic  and  attribute  tracking. 
Given  the  sensor  mode,  namely  imagery  generated  by 
video,  and  the  target  type,  namely  a  dismounted  tar¬ 
get,  attributes  will  play  a  major  role  in  the  ability  to 
maintain  kinematic  track  and  to  provide  target  ID.  At¬ 
tribute  measurement  data  may  be  provided  by  image 
segmentation  methods  such  as  color  mapping  (hue  and 
saturation),  texture  mapping,  and  change  detection. 
These  three  methods  will  be  exploited  in  this  paper, 
but  the  overall  architecture  presented  is  easily  extend¬ 
able  to  other  methods.  The  attributes  provided  by 
color  and  texture  mapping  are  obvious,  and  the  spe¬ 
cific  method  used  in  the  simulation  will  be  discussed 
in  Section  3.  Change  detection  can  provide  both  kine¬ 
matic  and  attribute  data.  The  mass  of  a  detection  is 
considered  an  attribute,  and  the  center  of  this  mass 
provides  kinematic  position  data.  Further  detail  of  the 
tracking  algorithm  will  be  presented  in  Section  4. 

With  this  basic  understanding  of  the  measurement 
information  used  by  the  tracking  algorithm,  it  is  easy 
to  recognize  the  interaction  of  the  two  primary  func¬ 
tions.  Specifically,  the  kinematic  data  is  used  by  the 
resource  allocation  function  to  determine  regions  that 
should  receive  a  higher  allocation  of  the  image  process¬ 
ing  resources.  In  addition,  the  ability  of  the  resource 
allocation  routine  to  provide  the  optimal  set  of  at¬ 
tribute  measurements  to  the  tracking  algorithm  will 
ensure  enhanced  tracking. 

The  next  sections  provide  details  of  the  resource  al¬ 
location  method  followed  by  a  specific  implementation 
of  a  relatively  simple  tracking  algorithm  and  the  ver¬ 
sion  of  the  image  segmentation  methods  used  in  the 
simulation  study. 


2  Optimal  Resource  Allocation 


The fundamental questions being answered by the resource allocation algorithm
are "which regions of the image should have priority for image processing?" and
"which of the various segmentation algorithms should be applied in these
regions?" A straightforward method is used: the image is divided into
sub-regions at various levels of resolution. This results in a grid as
illustrated in Figure 1 for a 21 sub-region case. In this example, the first 16
sub-regions represent the high resolution regions, 17-20 are medium resolution,
and 21 has the lowest resolution. Various image segmentation algorithms may be
employed in each sub-region. As such, the number of options available to the
resource allocation process is the number of sub-regions multiplied by the
number of segmentation methods. A key factor in choosing which options to
execute is the benefit associated with each option. There are variations in the
costs associated with processing high versus low resolution regions, and
variations in the costs of each type of segmentation algorithm. Given these
viewpoints, it is clear that a cost/benefit based optimization can be
performed.

Figure 1: Sub-region Grid

The value associated with each sub-region may depend on many factors. The
primary factors in this study come from the kinematic states of the tracking
routine, the relative value of the segmentation methods, and the target ID
associated with a given track. The distance of a sub-region relative to a given
track's position will be considered for the assignment of values to each
sub-region. The full computation that determines the value of a given
sub-region will be described in Section 4.4. The focus of this paper will be to
establish the minimal valuation criteria to obtain a meaningful selection
within the optimization routine. Noting that the values assigned to a
sub-region will be based on the kinematic data gathered from previous
measurements, it is clear that the information gained from a tracking algorithm
will be used to drive the optimization routine. It is also evident that the
competence of the algorithm that derives the values for sub-regions is of
critical importance to the system.


Given the nature of the binary decisions being made, an obvious choice is the
use of binary integer programming to search this solution space. Linear integer
programs can be solved by a number of conventional solution techniques, so the
formulation of the linear program is the critical issue [2]. As with the
gridding and valuation considerations, the integer program must ensure that it
enhances the capability of the system while reducing the overall computational
load.

The primary constraint on the system is a limitation on how much time it should
take to complete the entire process. With knowledge of the relative time
requirements to complete the available segmentation tasks, the integer program
can be devised to solve the problem. Other system limitations can be
implemented, which may vary depending on the segmentation algorithms available.
Constraints are described as arithmetic equations that can define lines or
planes in a space. Each pairing of a sub-region with a segmentation algorithm
is considered a binary decision variable. Selecting a decision variable adds a
value to the solution while incurring a cost. The optimal solution is obtained
when the maximum value has been obtained while consuming all of the available
time to perform the chosen segmentation algorithms.

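We formally define the binary integer program as follows:

$$
\begin{aligned}
\text{maximize}  \quad & f_{ij}\, d_{ij} \\
\text{such that} \quad & A\, d_{ij} \le b \\
\text{and}       \quad & A_{eq}\, d_{ij} = b_{eq},
\end{aligned}
\tag{1}
$$

where

$d_{ij}$       binary decision variable for sub-region $i$ and segmentation method $j$
$f_{ij}$       value function at $i,j$
$A, A_{eq}$    constraint matrices for limits on $d_{ij}$
$b, b_{eq}$    limit vectors for the constraints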

Note that $d_{ij}$ is a vector of length equal to the number of image
sub-regions across all resolution levels ($n_{sub}$) multiplied by the number
of segmentation methods ($n_{seg}$) considered. A detailed description of the
computation of the value function, $f_{ij}$, is deferred until Section 4.4 due
to its dependence on information presented later in the paper. The inequality
constraints provide upper bounds on the number of decision variables with a
value of 1 and thus limit the number of times that a sub-region and
segmentation method can be selected. The structure of the A-matrix is such that
the first row in conjunction with the first element of the limit vector, $b$,
will provide a limit on the number of times segmentation method #1 can be
performed. A similar row is added for each segmentation method. Next, a row is
added for each sub-region in conjunction with another element of the vector $b$
to provide a limit on the number of times each sub-region can be operated on at
each sample time. The dimensions of the A-matrix will be
$(n_{sub} + n_{seg}) \times (n_{sub} \cdot n_{seg})$.
The second set of constraints ensures that all of the processing time available
at each epoch will be used. This is accomplished by setting $b_{eq}$ equal to
the maximum allowable cost, which represents the total processing time
available. The elements of $A_{eq}$ are based on the time required to perform
each of the possible segmentation tasks. The form of $A_{eq}$ is given by:

$$
A_{eq} = [\; c_1 \cdots c_1 \;\; c_2 \cdots c_2 \;\; \cdots \;\; c_{n_{seg}} \cdots c_{n_{seg}} \;]
$$

where $c_s$ = the cost to perform segmentation method $s$. The dimensions of
the $A_{eq}$-matrix will be $1 \times (n_{sub} \cdot n_{seg})$.

The  binary  integer  program  is  itself  a  computation¬ 
ally  intensive  mechanism.  To  ensure  that  this  method 
does  not  merely  trade  one  numerically  intensive  mech¬ 
anism  for  another,  there  are  means  that  can  be  em¬ 
ployed  to  reduce  the  overall  size  of  the  integer  program. 
A  form  of  gating  in  which  sub-regions  may  be  chosen 
based  on  their  likelihood  of  providing  information  may 
be  used.  These  sub-regions  would  be  the  only  ones 
available  for  consideration  within  the  integer  program. 
This  method  would  require  increased  confidence  in  the 
tracker  and  assumes  that  new  targets  are  not  entering 
the  image  space.  This  may  be  accomplished  by  apply¬ 
ing  wide  area  change  detection  at  low  resolution,  then 
executing  the  integer  program  given  the  resources  that 
remain. 
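
The program in Eq. (1) can be illustrated with a minimal sketch for a toy problem. Everything specific here is an assumption made for illustration: the ordering of the decision variables (grouped by method), the example costs, values, and limits, and the exhaustive search, which is practical only for a handful of variables and stands in for whatever solver an implementation would actually use.

```python
from itertools import product

def build_constraints(n_sub, n_seg, cost, max_per_method, max_per_region):
    """Build A, b and the single-row A_eq for the program in Eq. (1).

    Assumed layout: variables are grouped by method, so the variable for
    sub-region i and method j sits at index j * n_sub + i.
    """
    n = n_sub * n_seg
    A, b = [], []
    for j in range(n_seg):      # one row per segmentation method
        A.append([1 if idx // n_sub == j else 0 for idx in range(n)])
        b.append(max_per_method[j])
    for i in range(n_sub):      # one row per sub-region
        A.append([1 if idx % n_sub == i else 0 for idx in range(n)])
        b.append(max_per_region)
    a_eq = [cost[idx // n_sub] for idx in range(n)]
    return A, b, a_eq

def solve_bip(f, A, b, a_eq, b_eq):
    """Exhaustive search over binary vectors; returns (value, d) or (None, None)."""
    best_val, best_d = None, None
    for d in product((0, 1), repeat=len(f)):
        if sum(c * x for c, x in zip(a_eq, d)) != b_eq:
            continue            # equality constraint: spend the whole budget
        if any(sum(a * x for a, x in zip(row, d)) > lim
               for row, lim in zip(A, b)):
            continue            # an inequality limit is violated
        val = sum(v * x for v, x in zip(f, d))
        if best_val is None or val > best_val:
            best_val, best_d = val, d
    return best_val, best_d

# Hypothetical example: 3 sub-regions, 2 methods costing 1 and 4 time units.
n_sub, n_seg = 3, 2
f = [3, 1, 2, 8, 2, 5]          # values, method 0 first, then method 1
A, b, a_eq = build_constraints(n_sub, n_seg, [1, 4], [3, 3], 2)
best_value, best_d = solve_bip(f, A, b, a_eq, b_eq=6)
```

With a budget of 6 units, the only feasible selections combine two cheap operations with one expensive one, and the search picks the combination with the largest total value.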


3  Image  Processing 

Various  segmentation  methods  can  be  applied  to  2-D 
imagery  for  the  purpose  of  extracting  measurements 


that  are  processed  by  a  tracking  algorithm.  Although 
this  study  focuses  on  three  such  methods,  it  is  im¬ 
portant  to  note  the  resource  allocation  function  and 
the  overall  tracking  architecture  can  readily  accept  any 
number  of  segmentation  methods.  The  three  methods 
mentioned  previously  are  change  detection,  color  map¬ 
ping,  and  texture  mapping. 

Change  detection  is  computationally  inexpensive 
and  is  a  pixel  by  pixel  binary  indication  that  a  change 
has  occurred  in  the  image.  The  fundamental  con¬ 
cept  is  to  compare  frames  of  imagery  and  detect  vari¬ 
ations  from  one  frame  to  the  next.  Simply  declaring  a 
change  based  on  two  frames  of  data  is  susceptible  to 
noise  and  stationary  objects  with  second  order  move¬ 
ments.  Multi-frame  smoothing  may  be  applied  to  filter 
out  these  movements,  and  focus  on  moving  objects  of 
greater interest to the tracker. Given that the change detection of a group of
contiguous pixels defines a moving object, there are two main measurements that
can be utilized. First, the mass of the object is defined by the number of
pixels in the group and thus provides an attribute type measurement. Second,
the center of mass of the pixels defines the kinematic location of the object.
Both of these measurements will be provided to the tracking algorithm as
described in the next section.
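
As a rough sketch of the two change-detection measurements, the following compares two frames pixel by pixel, groups contiguous changed pixels, and reports the mass and center of mass of each group. The threshold, the 4-connected grouping, and the function names are assumptions for illustration; multi-frame smoothing is omitted.

```python
def change_mask(prev, curr, thresh):
    """Binary per-pixel change indication between two grayscale frames."""
    return [[abs(c - p) > thresh for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]

def detections(mask):
    """Return [(mass, (row_center, col_center)), ...] for each 4-connected group."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    found = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            stack, pixels = [(r, c)], []
            seen[r][c] = True
            while stack:                      # flood fill one group
                y, x = stack.pop()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            mass = len(pixels)                # attribute measurement
            center = (sum(p[0] for p in pixels) / mass,
                      sum(p[1] for p in pixels) / mass)  # kinematic measurement
            found.append((mass, center))
    return found

# Hypothetical 4x4 frames: a 2x2 object appears in the second frame.
prev_frame = [[0] * 4 for _ in range(4)]
curr_frame = [[0, 0, 0, 0], [0, 10, 10, 0], [0, 10, 10, 0], [0, 0, 0, 0]]
result = detections(change_mask(prev_frame, curr_frame, thresh=5))
```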

Color mapping uses a combination of hue and saturation levels for an object
within a sub-region. A finite discrete number of bins, $n_{bins}$, is
established to represent the possible variations of color. For this study,
$n_{bins} = 30$, resulting in a set of 30 distinguishable hue levels, and 30
more bins are used to distinguish saturation levels. Each group of bins
represents an attribute measurement. The computation of hue and saturation
values is done pixel-by-pixel using the transformation [3]:

$$
hue = \cos^{-1}\left\{ \frac{\tfrac{1}{2}\,[(R - G) + (R - B)]}{\sqrt{(R - G)^2 + (R - B)(G - B)}} \right\}
\tag{2}
$$

$$
sat = 1 - \frac{3\,\min(R, G, B)}{R + G + B}
\tag{3}
$$

where R, G, and B represent the values of red, green, and blue present in each
pixel. This will provide a value from 0 to 1 for each pixel for both hue and
saturation. The value space is discretized into 30 evenly-spaced bins for this
experiment. The same binning process is used for saturation data.
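
The per-pixel transformation and binning can be sketched as follows. Two details are assumptions not stated in the text: the angle is reflected when B > G and divided by 2π to bring hue into the range 0 to 1 (the usual HSI convention), and gray or black pixels, for which the expressions are undefined, are assigned a value of zero.

```python
import math

N_BINS = 30  # number of bins used in this study

def hue_sat(r, g, b):
    """Hue and saturation of one RGB pixel, each scaled to [0, 1]."""
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if den == 0.0:
        theta = 0.0                       # gray pixel: hue undefined (assumed 0)
    else:
        theta = math.acos(max(-1.0, min(1.0, num / den)))
    if b > g:
        theta = 2.0 * math.pi - theta     # assumed HSI convention
    hue = theta / (2.0 * math.pi)
    total = r + g + b
    sat = 0.0 if total == 0 else 1.0 - 3.0 * min(r, g, b) / total
    return hue, sat

def bin_index(value, n_bins=N_BINS):
    """Map a value in [0, 1] to one of n_bins evenly spaced bins."""
    return min(int(value * n_bins), n_bins - 1)

def color_histograms(pixels):
    """Accumulate hue and saturation histograms over an iterable of RGB pixels."""
    hue_hist, sat_hist = [0] * N_BINS, [0] * N_BINS
    for r, g, b in pixels:
        h, s = hue_sat(r, g, b)
        hue_hist[bin_index(h)] += 1
        sat_hist[bin_index(s)] += 1
    return hue_hist, sat_hist

# Hypothetical two-pixel region: pure red and mid gray.
hue_hist, sat_hist = color_histograms([(255, 0, 0), (10, 10, 10)])
```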

The intensity component of an image is useful for describing an image's
texture. Intensity is defined as:

$$
I = \tfrac{1}{3}(R + G + B)
\tag{4}
$$

The spatial characteristics of texture can be calculated using the Fourier
transform. This provides the ability to distinguish between periodic and
non-periodic patterns and to quantify differences between periodic patterns.
Spectral measurements are a result of the 2-dimensional Fast Fourier Transform
(FFT) of the image. This FFT produces a function of the image, $S(r, \theta)$,
in polar coordinates. Two 1-dimensional summations of this function provide two
additional attributes through:

$$
S(r) = \sum_{\theta = 0}^{\pi} S_{\theta}(r)
\tag{5}
$$

$$
S(\theta) = \sum_{r = 1}^{R_0} S_{r}(\theta)
\tag{6}
$$

where $R_0$ is the radius of a circle centered at the origin.


4  Target  Tracking 

As previously mentioned, the method employed to track targets is a combination
of kinematic and attribute tracking. Kinematic tracking here assumes a 2-D
space within which the targets may move. An occlusion is defined as a subset of
the target space where measurements are not available, but where the target may
move freely. States used to represent the kinematic and attribute data can be
decoupled based on the following discussion. Attributes including color,
texture, and mass do not affect the target dynamics model. In many cases,
attributes lead to target ID, which may be correlated to target dynamics.
However, our only targets of interest are urban dismounts, so the target
dynamics are assumed identical. Similarly, target position and velocity will
not directly affect target color, texture, or mass. This decoupling provides
considerable simplification for the target tracker.


4.1  Kinematic  Tracking 

Consider the kinematic states of position and velocity,
$x_{kin}(k) = [\,p_x \;\; v_x \;\; p_y \;\; v_y\,]^T$. The states evolve
according to the following constant velocity dynamics model [4]:

$$
x_{kin}(k) = \Phi\, x_{kin}(k-1) + G_d(k-1)\, w_d(k-1)
\tag{7}
$$

$$
x_{kin}(k) =
\begin{bmatrix} 1 & T & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & T \\ 0 & 0 & 0 & 1 \end{bmatrix}
x_{kin}(k-1) +
\begin{bmatrix} T^2/2 & 0 \\ T & 0 \\ 0 & T^2/2 \\ 0 & T \end{bmatrix}
w_d(k-1)
\tag{8}
$$

where $T$ is the time between measurement points $(k-1)$ and $k$. The
measurements consist of a two-element vector related to the state vector by the
expression:

$$
z_{kin}(k) = H\, x_{kin}(k) + v(k) =
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
x_{kin}(k) + v(k)
\tag{9}
$$

where $w(k)$ and $v(k)$ are two independent zero-mean white noise processes
such that:

$$
E\{ w(t)\, w(t + \tau)^T \} = Q\, \delta(\tau)
$$


Given  this  simple  target  model,  linear  Kalman  filter¬ 
ing  will  be  used  for  the  kinematic  tracker.  The  classi¬ 
cal  method  of  state  propagation  between  measurement 
updates  is  conducted,  while  measurement  updates  may 
not  occur  at  every  sample  time  due  to  occluded  targets. 
As  such,  during  periods  of  prolonged  occlusion,  the 
target  track  errors  will  grow.  The  reappearance  of  the 
target  on  a  boundary  of  the  occlusion  would  provide  a 
measurement  update  for  the  existing  track.  However, 
there  is  no  guarantee  that  the  target  will  reappear  by 
the  time  the  propagated  track  reaches  a  boundary  of 
the occlusion. This leads to the need to manage track initiation and deletion.
Candidate tracks are initiated via change detections at locations equal to the
center of mass of the detection. A goal of the tracking routine is to maintain
track of any high-valued targets. More than one track may exist since we cannot
assume that the first detected change represents a high-valued target. This
essentially represents multiple hypotheses in the context of many candidate
tracks competing for the role of the high-valued target. Note again that the
resource allocation concept is not dependent on the idea of a single
high-valued target, and it could still provide optimal processing decisions for
multiple high-valued targets. Standard tracking concepts such as track gating
and M/N initiation may help to filter out unrealistic new tracks. Track
deletion techniques such as track scoring may be used to remove tracks of
targets that are believed to have left the scene.
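
The behavior during occlusion can be sketched with a single-axis version of the constant velocity filter. This is an illustration under stated assumptions rather than the simulation's implementation: it assumes the x and y axes are uncorrelated so that each can be filtered separately, and the noise levels `q` and `r` and the function name `kf_step` are hypothetical.

```python
def kf_step(x, P, z, T, q, r):
    """One predict/update cycle for one axis with state (position, velocity).

    z is the measured position, or None while the target is occluded.
    q scales the process noise through G = [T^2/2, T]; r is the
    measurement noise variance.
    """
    p, v = x
    (a, b), (c, d) = P
    # Predict: x = F x and P = F P F^T + q G G^T, with F = [[1, T], [0, 1]].
    p = p + T * v
    a, b, c, d = (a + T * (b + c) + T * T * d + q * T ** 4 / 4,
                  b + T * d + q * T ** 3 / 2,
                  c + T * d + q * T ** 3 / 2,
                  d + q * T ** 2)
    if z is None:
        # Occluded: propagate only, so the covariance keeps growing.
        return (p, v), ((a, b), (c, d))
    # Update with H = [1, 0].
    s = a + r
    k0, k1 = a / s, c / s
    y = z - p
    p, v = p + k0 * y, v + k1 * y
    P_new = (((1 - k0) * a, (1 - k0) * b), (c - k1 * a, d - k1 * b))
    return (p, v), P_new

# Hypothetical track: one measurement, two occluded epochs, then reappearance.
x0, P0 = (0.0, 1.0), ((1.0, 0.0), (0.0, 1.0))
x1, P1 = kf_step(x0, P0, 1.0, T=1.0, q=0.01, r=0.25)
x2, P2 = kf_step(x1, P1, None, T=1.0, q=0.01, r=0.25)
x3, P3 = kf_step(x2, P2, None, T=1.0, q=0.01, r=0.25)
x4, P4 = kf_step(x3, P3, 4.0, T=1.0, q=0.01, r=0.25)
```

The position variance `P[0][0]` grows across the two occluded epochs and drops again once a measurement is available, which is the track-error growth described above.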


4.2  Attribute  Tracking 

The attribute state vector represents a concatenation of four types of
attribute data as follows:

$$
x_{att} =
\begin{bmatrix}
hue_{1-n_{bins}} \\ sat_{1-n_{bins}} \\ S(r) \\ S(\theta) \\ mass
\end{bmatrix}
\tag{10}
$$

where $hue_{1-n_{bins}}$ and $sat_{1-n_{bins}}$ are vectors of dimension
$n_{bins} \times 1$ representing the 30 bins of hue and saturation data
respectively. The remaining attributes are scalars as defined in Section 3.
Inadequate knowledge of the statistical nature of the target state and the
measurements leads to a desire to use a sub-optimal approach to perform
attribute updates. Measurement updates are executed based on the simplistic and
understandably ad hoc method of a mixture weighting given by [5]:

$$
x_{att}(k) = \alpha\, x_{att}(k-1) + (1 - \alpha)\, z_{att}(k)
\tag{11}
$$

where the weight $\alpha$ is empirically chosen. The attribute states are
initialized with the first measurement of each type, and those attributes
lacking any measurements take on a null value to prevent an erroneous target
ID.
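
A small sketch of the mixture update of Eq. (11), applied element by element, is given below. Representing the null value as `None`, holding the state when no new measurement arrives, and the example weight of 0.5 are assumptions for illustration.

```python
def update_attributes(state, meas, alpha):
    """Elementwise mixture update; None marks an attribute with no data yet."""
    updated = []
    for x, z in zip(state, meas):
        if z is None:
            updated.append(x)      # nothing measured this epoch: hold the state
        elif x is None:
            updated.append(z)      # initialize with the first measurement
        else:
            updated.append(alpha * x + (1.0 - alpha) * z)
    return updated

# Hypothetical three-element attribute state and measurement.
new_state = update_attributes([None, 10.0, 4.0], [2.0, 20.0, None], alpha=0.5)
```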


4.3  Target  ID 

Target ID is mapped to three discrete values: Desired, Decoy, Unknown. The
features of the desired target, denoted here as "true", must be set a priori or
determined by prolonged observation. Bayesian inference is used to compute the
probability of ID, $P_{ID} = p(c_i | V_k)$, for each track as follows [6]:

$$
p(c_i | V_k)
= \frac{p(V_k | c_i)\, p(c_i)}{\sum_j p(V_k | c_j)\, p(c_j)}
= \frac{p(v_k | c_i)\, p(c_i | V_{k-1})}{\sum_j p(v_k | c_j)\, p(c_j | V_{k-1})}
\tag{12}
$$

where the discrete transitional density, $p(v_k | c_i)$, is based on attribute
data using a simple binary voting method that represents the set intersection
of the various features. Note that Eq. (12) assumes independence of the
measurements in the sense that:

$$
p(V_k | c) = p(v_k, v_{k-1}, \ldots, v_1 | c) = \prod_{i=1}^{k} p(v_i | c)
\tag{13}
$$

The declaration of target ID is made using maximum a posteriori (MAP) inference
at a given sample time via [6]:

$$
ID = \arg\max_i \; p(c_i | V_k)
\tag{14}
$$


This hard decision is acceptable for many applications. It answers the question
of "What class of target is it?" Given the four types of attributes defined
previously, a target can receive 0 to 4 votes to determine if the target
matches the features of the desired target. The transitional density,
$p(v_k | c_i)$, will take on discrete values based on the number of "yes" votes
received. For example:

    Votes    Decoy    Unknown    Desired
      0      0.38      0.33       0.29
      1      0.35      0.34       0.31
      2      0.34      0.35       0.31
      3      0.33      0.34       0.33
      4      0.28      0.33       0.39

The probability of ID can then be computed at time k using Eq. (12) starting
from an initial condition without bias such as $P_{ID} = [0.33 \;\; 0.33 \;\; 0.33]$
for $c_i$ = [Decoy Unknown Desired].
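
The recursive form of Eq. (12) and the MAP declaration of Eq. (14) can be sketched using the example vote table above. The function and variable names are hypothetical, and the uniform prior is written as exact thirds rather than 0.33.

```python
CLASSES = ("Decoy", "Unknown", "Desired")

# Example transitional densities from the vote table, indexed by "yes" votes.
LIKELIHOOD = {
    0: (0.38, 0.33, 0.29),
    1: (0.35, 0.34, 0.31),
    2: (0.34, 0.35, 0.31),
    3: (0.33, 0.34, 0.33),
    4: (0.28, 0.33, 0.39),
}

def update_id(prior, votes):
    """One recursion of Eq. (12): posterior is likelihood times prior, normalized."""
    joint = [l * p for l, p in zip(LIKELIHOOD[votes], prior)]
    total = sum(joint)
    return [j / total for j in joint]

def declare(posterior):
    """MAP declaration of Eq. (14)."""
    best = max(range(len(posterior)), key=posterior.__getitem__)
    return CLASSES[best]

# Hypothetical track that receives four "yes" votes on five successive epochs.
p_id = [1.0 / 3.0] * 3
for _ in range(5):
    p_id = update_id(p_id, votes=4)
label = declare(p_id)
```

Because the example densities differ only slightly between classes, the posterior moves slowly, so several consistent epochs are needed before one class clearly dominates.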

The required voting is based on metrics specific to each type of attribute. In
the case of mass, a distance metric is given by

$$
\Delta mass_i = mass_{true} - mass_i
\tag{16}
$$

for track $i$ and the desired target's "true" mass. If
$|\Delta mass_i| < \epsilon$, where $\epsilon$ is an empirical threshold, then
track $i$ receives a "yes" vote as a match.

Hue and saturation are represented by vectors of dimension $n_{bins} \times 1$,
lending themselves to a correlation metric. For each vector, the pairwise
linear correlation coefficient is computed between the "true" and "track" hue
and saturation vectors. A correlation coefficient arbitrarily close to one
receives a "yes" vote as a match. The same approach is applied to the texture
statistics, $[S(r), S(\theta)]$, requiring that the correlation between both
measurements and their "true" values lie arbitrarily close to one in order to
receive a "yes" vote for the texture component.
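
The four votes can be sketched as below. The thresholds `eps` and `rho_min`, the dictionary layout of a track, and the treatment of the two texture statistics as sampled sequences are all assumptions made for illustration.

```python
import math

def pearson(a, b):
    """Linear correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0                 # constant vector: treat as no correlation
    return cov / math.sqrt(var_a * var_b)

def count_votes(track, true, eps, rho_min):
    """Number of "yes" votes (0 to 4) for mass, hue, saturation, and texture."""
    votes = 0
    votes += int(abs(true["mass"] - track["mass"]) < eps)
    votes += int(pearson(track["hue"], true["hue"]) >= rho_min)
    votes += int(pearson(track["sat"], true["sat"]) >= rho_min)
    # Texture requires both spectral statistics to match.
    votes += int(pearson(track["S_r"], true["S_r"]) >= rho_min
                 and pearson(track["S_theta"], true["S_theta"]) >= rho_min)
    return votes

# Hypothetical "true" features and a track whose mass and hue do not match.
true_target = {"mass": 100, "hue": [1, 2, 3, 4], "sat": [4, 1, 3, 2],
               "S_r": [5, 3, 2, 1], "S_theta": [1, 4, 1, 4]}
other_track = dict(true_target, mass=150, hue=[4, 3, 2, 1])
votes_same = count_votes(true_target, true_target, eps=10, rho_min=0.95)
votes_other = count_votes(other_track, true_target, eps=10, rho_min=0.95)
```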

4.4  Value  Function  fij 

Obtaining the optimal solution relies upon using the available information to
properly set the values that drive the integer program. The factors which
influence the value function used in the integer program are based on the
kinematic states of the tracking routine, the relative value of the
segmentation methods, and the target ID associated with a given track. These
are combined for each combination of target, sub-region, and segmentation using
the expression:

$$
f_{i,j,k} = dist_{i,j,k} \cdot \kappa_k \cdot m_{j,k} \cdot \beta_j
\tag{17}
$$

Each of the coefficients will be described presently.

The value $dist_{i,j,k}$ represents the relative value of the target appearing
at a given location. In order to generate a Gaussian-like weighting of the
value, the error distance, $\rho_{i,j,k}$, is used as the argument of the error
function given by:

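$$
dist_{i,j,k} = \frac{2}{\sqrt{\pi}} \int_{0}^{\rho_{i,j,k}} e^{-t^2}\, dt
\tag{18}
$$

where

$$
\rho_{i,j,k} = \sqrt{(p_{x_k} - s_{x_i})^2 + (p_{y_k} - s_{y_i})^2}
\tag{19}
$$

with $s_{x_i}$ and $s_{y_i}$ representing the center of sub-region $i$.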

Next, the target ID is used to provide a scale factor, $\kappa_k$, which is
applied for target $k$ in order to put emphasis on a particular type of
classification. Based on a set of possible target IDs, the value $\kappa_k$ may
be defined as:

$$
\kappa_k =
\begin{cases}
2, & ID = 1 \\
5, & ID = 2 \\
3, & ID = 3
\end{cases}
\tag{20}
$$

The values for $\kappa_k$ are arbitrary and represent an ad hoc tuning
parameter within the value assignment constraint. A scaling $m_{j,k}$ is given
by the ratio between the number of times any segmentation $j$ has been
performed on a given track $k$ to the number of times any other operation has
been performed on this same track.

This weight can be illustrated as a piecewise function of the counts $N_{2,k}$
and $N_{3,k}$, with one case for $N_{2,k} > N_{3,k}$ and another for
$N_{2,k} < N_{3,k}$, where $N_{j,k}$ is the number of times the segmentation
$j$ has been performed on the $k^{th}$ target.


Figure  2:  System  with  Binary  Integer  Program 

For the three-segmentation case used here, a relative weight, $\beta_j$, is now
applied to discern the benefit of performing one type of processing over the
other in terms of the benefit obtained in classification that results from this
type of processing. For example,

$$
\beta_j =
\begin{cases}
1, & j = 1, \text{ mass} \\
2, & j = 2, \text{ color} \\
3, & j = 3, \text{ texture}
\end{cases}
$$

The values $\beta_j$ are arbitrary and represent an additional tuning parameter
within the value assignment construct. Finally, since the goal is to determine
a value for each sub-region and each segmentation algorithm, we will sum the
values associated with each target using the expression:

$$
f_{i,j} = \sum_{k} f_{i,j,k}
\tag{22}
$$

to arrive at an objective function coefficient for each decision variable in
the BIP.
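
Assembling the objective coefficients from the per-track factors can be sketched as follows. The distance weights and usage-ratio weights are taken as precomputed inputs, so their exact form is not assumed here; the example scale factors and relative weights are the ones quoted above, and all other names and numbers are hypothetical.

```python
KAPPA = {1: 2.0, 2: 5.0, 3: 3.0}   # example ID scale factors
BETA = {1: 1.0, 2: 2.0, 3: 3.0}    # example weights: mass, color, texture

def value_coefficients(dist, track_ids, m):
    """Sum the per-track values over all tracks.

    dist[k][i]   distance weight of sub-region i for track k (precomputed)
    track_ids[k] declared ID (1, 2, or 3) of track k
    m[k][j]      usage-ratio weight of segmentation j for track k (precomputed)
    Returns a dict mapping (i, j) to the objective coefficient.
    """
    n_sub = len(dist[0])
    f = {(i, j): 0.0 for i in range(n_sub) for j in BETA}
    for k, track_id in enumerate(track_ids):
        for i in range(n_sub):
            for j in BETA:
                f[(i, j)] += dist[k][i] * KAPPA[track_id] * m[k][j] * BETA[j]
    return f

# Hypothetical case: two sub-regions and two tracks with unit usage weights.
unit = {1: 1.0, 2: 1.0, 3: 1.0}
f_one = value_coefficients([[1.0, 0.5]], [2], [unit])
f_two = value_coefficients([[1.0, 0.5], [0.0, 1.0]], [2, 1], [unit, unit])
```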


5  System  Overview 

The  goal  of  this  paper  is  to  achieve  optimal  resource 
allocation  of  the  image  processing  resources  available 
to the computer. The optimal solution itself incurs a computational cost, so
the performance of a system that computes an optimal solution should be
compared with that of a simpler allocation method.
Several  components  are  used  to  combine  the  image 
processing,  resource  allocation,  and  target  kinematic 
and  attribute  tracking.  The  program  implemented  di¬ 
vides  the  work  into  routines  that  perform  each  of  the 
necessary  tasks  and  provides  information  to  the  next 
task.  This  construct  is  illustrated  in  Figures  2  and 
3.  The  purpose  of  the  Logic  Algorithm  system  is  to 
provide  a  means  to  compare  the  optimal  resource  al¬ 
location  system  with  a  system  that  provides  resource 
allocation  by  making  simple  logic  decisions.  The  per¬ 
formance  of  the  Integer  Program  system  will  be  com¬ 
pared  to  the  Logic  system  to  determine  its  effectiveness 
in  terms  of  its  ability  to  provide  proper  target  classifi¬ 
cation  and  reduced  resource  consumption. 


Figure  3:  System  with  Logic  Algorithm 

6  Simulation  Study 

A  simulation  study  was  conducted  containing  three 
targets  moving  through  a  scene  to  illustrate  the  va¬ 
lidity  of  the  algorithms  proposed  in  this  paper.  Two 
scenarios  were  run.  One  consisted  of  57  time  epochs 
(57  frames  of  imagery)  and  the  other  contained  48  time 
epochs.  The  targets  are  soldiers  with  different  uni¬ 
forms,  moving  across  a  fixed  scene  containing  occlu¬ 
sions.  Although  all  of  the  targets  are  very  similar  in 
size,  their  color  and  texture  attributes  differ  somewhat. 
Two  of  the  three  targets  are  essentially  distractors,  and 
it  is  desirable  to  track  the  single  high- valued  target. 
The  other  targets  (decoys)  generate  a  computational 
load  on  the  image  processing,  requiring  some  method 
for  prioritizing  the  resource  allocation.  When  targets 
enter  an  occluded  region  they  no  longer  generate  mea¬ 
surements  or  attribute  data,  but  they  may  re-emerge 
at  a  later  time.  While  it  is  desired  to  be  able  to  estab¬ 
lish  a  target’s  ID  at  a  time  before  it  enters  an  occlusion, 
the  ultimate  purpose  of  this  study  is  to  be  able  to  es¬ 
tablish  a  positive  identification  on  the  "high- valued" 
target  when  it  emerges  from  an  occlusion  event.  As 
the  targets  move  through  the  scene,  they  are  classified 
according  to  their  attributes.  There  will  be  situations 
that  may  cause  improper  classification.  Targets  may 
move  so  close  they  merge  onto  each  other,  they  may  be¬ 
come  partially  occluded  behind  an  obstacle,  and  their 
paths  may  cross.  Simulations  will  be  processed  using 
two  types  of  allocation  algorithms  that  select  which 
sub-regions  to  use  for  image  processing.  One  model  is 
driven  by  the  binary  integer  program  and  will  be  re¬ 
ferred  to  as  "BIP".  The  other  will  make  simple  logic- 
based  decisions  driven  solely  by  the  targets’  kinematic 
states.  This  will  be  referred  to  as  "Logic,"  with  the 
principal  idea  was  to  generate  a  competing  algorithm 
for  the  BIP  concept.  The  logic  algorithm  will  order 
change  detection  around  each  target,  and  a  combina¬ 
tion  of  color  and  texture  processing  is  accomplished  in 
and  around  each  change  detection.  The  Logic  concept 
represents  a  greedy  algorithm,  which  is  not  constrained 
on  a  global  level.  In  other  words,  it  is  allowed  order 
processing  (in  the  vicinity  of  a  change)  without  regard 
for  the  total  processing  being  conducted.  In  contrast, 
when multiple targets are competing for resources, the
BIP divides limited resources among the various targets
on a global scale.

Table 1: Resources Consumed by Allocation Methods

Scenario 1
Method  Quantity     Mass     Color    Texture   Total Tokens
BIP     Averages:    1809.8   187.9    337.3
        Tokens:      1809.8   751.6    921       3482.4
Logic   Averages:    1419.7   337.3    330.4
        Tokens:      1419.7   1349.2   1652      4420.9
        % Savings:   -27.4    44.2     44.2      Net Savings: 21.2%

Scenario 2
Method  Quantity     Mass     Color    Texture   Total Tokens
BIP     Averages:    2178.3   229.6    228
        Tokens:      2178.3   918.4    1140      3482.4
Logic   Averages:    1602.5   334.8    339.7
        Tokens:      1602.5   1339.2   1698.5    4640.2
        % Savings:   -35.9    21.4     32.9      Net Savings: 8.7%
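
The contrast between the two allocators can be illustrated with a toy
example. The sketch below is not the formulation used in the simulations:
the target names, the valuation numbers, and the token budget are
hypothetical, the Logic allocator is simplified to ordering every available
measurement, and the binary program is solved by exhaustive enumeration
rather than by an integer programming solver. Only the relative token costs
(1, 4, and 5 for mass, color, and texture) come from the timing measurements
reported in this section.

```python
from itertools import product

# Relative token costs per measurement (mass = 1, color = 4, texture = 5).
COST = {"mass": 1, "color": 4, "texture": 5}

# Hypothetical value of each (target, segmentation) measurement.
# These numbers are illustrative only; they are not taken from the paper.
VALUE = {
    ("high_value", "mass"): 2.0,
    ("high_value", "color"): 6.0,
    ("high_value", "texture"): 7.0,
    ("decoy", "mass"): 1.0,
    ("decoy", "color"): 2.0,
    ("decoy", "texture"): 2.5,
}


def tokens(chosen):
    """Total token cost of a set of (target, segmentation) pairs."""
    return sum(COST[seg] for _, seg in chosen)


def logic_allocation(values):
    """Simplified greedy allocator: order everything, ignore the total cost."""
    return set(values)


def bip_allocation(values, budget):
    """Binary program by enumeration: maximize value with tokens <= budget."""
    keys = sorted(values)
    best, best_value = set(), 0.0
    for bits in product((0, 1), repeat=len(keys)):
        chosen = {k for k, b in zip(keys, bits) if b}
        value = sum(values[k] for k in chosen)
        if tokens(chosen) <= budget and value > best_value:
            best, best_value = chosen, value
    return best


logic_choice = logic_allocation(VALUE)
bip_choice = bip_allocation(VALUE, budget=10)
print(tokens(logic_choice), tokens(bip_choice))  # prints "20 10"
```

With these made-up values the unconstrained allocator spends 20 tokens,
while the budgeted allocator spends its 10 tokens entirely on the
high-value target. Enumeration is only practical for a handful of binary
variables; it is used here to keep the example self-contained.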

A short Monte Carlo analysis of 10 runs was performed
to determine general characteristics of the two
resource allocation methods. When using basic filtering
on the images, it was determined that a texture
measurement takes about 5 times longer, and a color
measurement about 4 times longer, than a mass
measurement. A token is defined as the amount
of time required to perform a mass measurement. For
each segmentation, j, the relative token cost is called
T_j (1 for mass, 4 for color, and 5 for texture). If
segmentation j is performed V_j times per epoch, the
total number of tokens, t_f, required to perform an
epoch's image processing is defined as:

\[ t_f = \sum_{j=1}^{3} V_j T_j \qquad (23) \]

This is used to determine the relative cost for the
simulations. Table 1 shows the resources consumed
by each of the allocation methods for the two
scenarios. The BIP algorithm uses fewer total tokens
than the Logic algorithm in both scenarios: it spends
more tokens on mass measurements but fewer on the
costlier color and texture measurements, for net
savings of 21.2% and 8.7%. It must be recognized that
the computational load of implementing the BIP
algorithm itself is not included here.
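
As a worked check of the token accounting, the sketch below applies
Eq. (23) to the Logic averages in Table 1. The function names are
illustrative, and the savings formula (tokens saved relative to the Logic
total) is an assumption; it is used here because it reproduces the net
savings printed for Scenario 1.

```python
# Relative token costs T_j from the timing measurements above.
TOKEN_COST = {"mass": 1.0, "color": 4.0, "texture": 5.0}


def epoch_tokens(counts):
    """Eq. (23): t_f = sum over j of V_j * T_j."""
    return sum(counts[name] * cost for name, cost in TOKEN_COST.items())


def percent_savings(candidate, baseline):
    """Assumed definition: tokens saved relative to the baseline, in percent."""
    return 100.0 * (baseline - candidate) / baseline


# Average measurement counts V_j for the Logic method (Table 1).
logic_s1 = {"mass": 1419.7, "color": 337.3, "texture": 330.4}
logic_s2 = {"mass": 1602.5, "color": 334.8, "texture": 339.7}

logic_tokens_s1 = epoch_tokens(logic_s1)
logic_tokens_s2 = epoch_tokens(logic_s2)

# BIP total for Scenario 1 as printed in Table 1.
net_savings_s1 = percent_savings(3482.4, logic_tokens_s1)

print(round(logic_tokens_s1, 1), round(logic_tokens_s2, 1), round(net_savings_s1, 1))
# prints "4420.9 4640.2 21.2"
```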

The classification accuracy was computed for each
scenario by counting how many times the correct or
incorrect classification was applied to each visible target
at each epoch. These totals are provided in Table 2.
In scenario 1, there are 30 opportunities for a correct
desired identification in each run, and 48 opportunities
for a correct decoy identification. Scenario 2 contains
27 opportunities for a correct desired identification and
45 opportunities for a correct decoy identification.

Table 2: Classification Quantities

            True Pos.   True Neg.   False Pos.   False Neg.
            C_D|T_D     C_N|T_N     C_D|T_N      C_N|T_D
Scenario 1
  BIP       73          426         54           227
  Logic     43          435         45           257
Scenario 2
  BIP       100         391         59           170
  Logic     63          424         26           207

Table 3: Overall Classification Accuracy

Scenario 1
              BIP      Logic    Best
P(T_D|C_D)    0.575    0.500    BIP
P(T_N|C_D)    0.425    0.500    BIP
P(T_D|C_N)    0.348    0.37     BIP
P(T_N|C_N)    0.652    0.63     BIP

Scenario 2
              BIP      Logic    Best
P(T_D|C_D)    0.629    0.708    Logic
P(T_N|C_D)    0.371    0.292    Logic
P(T_D|C_N)    0.303    0.328    BIP
P(T_N|C_N)    0.697    0.672    BIP
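
The counting procedure can be sketched as follows. The outcome encoding
(pairs of 'D' for desired and 'N' for decoy) and the function name are
assumptions made for illustration, and the short outcome list is made up.
The check at the end uses the Scenario 1 BIP counts from Table 2 together
with the 10 Monte Carlo runs and the per-run opportunity counts quoted
above.

```python
from collections import Counter


def tally(outcomes):
    """Count (truth, classification) pairs; 'D' = desired, 'N' = decoy."""
    counts = Counter(outcomes)
    return {
        "true_pos": counts[("D", "D")],   # desired target classified as desired
        "true_neg": counts[("N", "N")],   # decoy classified as decoy
        "false_pos": counts[("N", "D")],  # decoy classified as desired
        "false_neg": counts[("D", "N")],  # desired target classified as decoy
    }


# Hypothetical per-epoch outcomes, as (truth, classification) pairs.
example = [("D", "D"), ("N", "N"), ("N", "D"), ("D", "N"), ("N", "N")]
example_counts = tally(example)

# Consistency check on the Scenario 1 BIP row of Table 2:
# 10 runs x 30 desired opportunities and 10 runs x 48 decoy opportunities.
bip_s1 = {"true_pos": 73, "true_neg": 426, "false_pos": 54, "false_neg": 227}
desired_total = bip_s1["true_pos"] + bip_s1["false_neg"]
decoy_total = bip_s1["true_neg"] + bip_s1["false_pos"]
print(example_counts, desired_total, decoy_total)
```

The desired-target outcomes sum to 300 and the decoy outcomes to 480,
matching 10 runs of 30 and 48 opportunities respectively.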

Given the information presented in Table 2, it is
possible to determine the probability of a target being
of one of the two types, desired or decoy, given a
classification. There are four combinations that can
be measured: the probabilities of having a desired
or decoy target, given that there is a positive (desired)
or negative (decoy) classification. These conditional
probabilities can be calculated via:

\[ P(T_x \mid C_y) = \frac{P(C_y \mid T_x)\, P(T_x)}{P(C_y)} \qquad (24) \]

where T_x indicates a target truly being desired or decoy
and C_y indicates a desired or decoy classification.

The values in Table 3 are the probabilities given
by Eq. (24). The variable T_D indicates a (D)esired
target, while T_N indicates the (N)on-desired or decoy
target. Likewise, C_D indicates a (D)esired classification,
while C_N indicates the (N)on-desired or decoy classification.
The algorithm that provides the better performance for
each case is indicated in the right column of each
scenario's entries. The probabilities P(T_N) and
P(T_D) are determined by the total number of times
each of the target types appears in the scene, divided
by the total appearances of all targets. The probabilities
P(C_D) and P(C_N) are the relative chance of
each of the two classifications occurring, and are
computed as the number of times each classification
is made divided by the total number of classifications
made.
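
The calculation can be sketched in a few lines, assuming Eq. (24) is
Bayes' rule with the priors and classification rates estimated from the
counts as just described. The function name and dictionary keys are
illustrative. The two rows used below are taken from Table 2, and the
printed values agree with the corresponding entries of Table 3.

```python
def posterior(counts):
    """Apply Eq. (24) to one row of Table 2; returns P(T_x | C_y)."""
    tp, tn = counts["true_pos"], counts["true_neg"]
    fp, fn = counts["false_pos"], counts["false_neg"]
    total = tp + tn + fp + fn

    p_t = {"D": (tp + fn) / total, "N": (tn + fp) / total}  # P(T_D), P(T_N)
    p_c = {"D": (tp + fp) / total, "N": (tn + fn) / total}  # P(C_D), P(C_N)
    # Likelihoods P(C_y | T_x), keyed as (y, x).
    p_c_given_t = {
        ("D", "D"): tp / (tp + fn),
        ("N", "D"): fn / (tp + fn),
        ("D", "N"): fp / (tn + fp),
        ("N", "N"): tn / (tn + fp),
    }
    # Bayes' rule, keyed as (x, y) for P(T_x | C_y).
    return {
        (x, y): p_c_given_t[(y, x)] * p_t[x] / p_c[y]
        for x in "DN"
        for y in "DN"
    }


# Rows from Table 2.
bip_s1 = {"true_pos": 73, "true_neg": 426, "false_pos": 54, "false_neg": 227}
logic_s2 = {"true_pos": 63, "true_neg": 424, "false_pos": 26, "false_neg": 207}

post_bip_s1 = posterior(bip_s1)
post_logic_s2 = posterior(logic_s2)
print(round(post_bip_s1[("D", "D")], 3), round(post_logic_s2[("D", "D")], 3))
# prints "0.575 0.708"
```

Because all terms are estimated from the same counts, the result reduces
to a ratio of counts; for example, P(T_D|C_D) equals the true positives
divided by all desired classifications.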


7  Conclusions 


Both resource allocation methods studied here provide
acceptable and relatively similar results in terms of
correctly classifying targets. The BIP method provides
improved performance in terms of computational load
over the simple Logic method. This benefit does come
at a cost: a valuation function must accurately
describe the value of collecting information on a given
target, and a BIP that is too complex may not be able
to determine the resource allocation quickly. The study
of the BIP's choices can be used to drive
the development of other logic-based algorithms that
function quickly and provide comparable results.
Certainly, such an ad hoc logic set can be employed
with adequate performance, as demonstrated
here. This leads to the concept of using the BIP to
help in the development of logic-based algorithms for
online use, which is the basis for future work in this
area. A different method of approaching the problem
may provide a more robust and reliable system. Such a
resource allocation system would be two-layered. The
first layer would determine how much of the system's
resources can be allocated at a given time to
various areas of the problem. This would be based on
standard linear programming techniques and would provide
information such as which image processing algorithms
to utilize and which targets should be examined. The
second layer would use simple logic to determine where
to apply the resources that are allowed at each epoch.